Abstract. Human-automation interaction has become one of the most important issues in
aviation  safety.  Although  automation  generally  increases  air  travel  safety  and  efficiency, 
sudden  automation  failures  have  produced  tragic  results.  Automation  failure  can  require  a 
previously  disengaged  human  pilot  to  react  immediately  to  counter  a  dangerous  situation.  In 
other  words,  automation  failure  can  inflict  time  pressure  on  human  expertise.  Studies  of 
expert  performance  have  disagreed  on  how  resistant  it  is  to  time  pressure  effects.  This  study 
examined  time  pressure  effects  on  some  of  the  most  expert  chess  players  in  the  world.  The 
results  show  that  time  pressure  can  have  profound  effects  even  on  extremely  high-level  expert 
performance.  Implications  for  the  aviation  domain  are  discussed. 

Keywords:  Expertise,  Time  Pressure,  Automation 


Introduction 

Human-automation  interaction  has  been  a  challenge  for  human  factors  for  quite  some  time 
and  its  relevance  continues  to  grow  (e.g.,  Bainbridge,  1983;  de  Winter  &  Dodou,  2014;  Fitts, 
1951; Jordan, 1963; Parasuraman & Byrne, 2003; Sheridan & Parasuraman, 2005; Welford,
1958).  Among  the  challenges,  automation  failures  can  produce  sudden  and  critical  moments 
in  which  human  expert  performance  is  subjected  to  severe  testing.  Consider  two  examples 
below. 

Example  One  -  On  July  20,  1969,  the  first  manned  spacecraft  to  attempt  a  landing  on  the 
moon, the Eagle, approached the lunar surface. Much of the approach sequence was under the
automated control of a computer. Suddenly the computer flashed a light to indicate a “Program
Alarm” and its display froze, showing an error code. In the Eagle, Neil Armstrong and
Buzz Aldrin began troubleshooting the problem, as did Michael Collins in the Apollo
Command Module and many people in Houston Control. As Buzz Aldrin recalled, it was a
tense moment: “Back in Houston, not to mention on board the Eagle, hearts shot up into our
throats while we waited to learn what would happen” (Collins & Aldrin, 1975, p. 212). Neil
Armstrong tested flight controls to determine if they were still working. Throughout all of this
troubleshooting, both astronauts had their attention inside the vehicle. At about 2,000 feet,
Armstrong looked out the window and realized that the descent trajectory was aimed at a
crater surrounded by boulders. With an error code still showing on the display, Armstrong
took control away from the computer and guided the Eagle to a safe landing area. At
touchdown,  only  40  to  50  seconds  of  fuel  remained.  In  this  case,  the  human  was  able  to 
compensate  for  the  sudden  automation  alarms  and  successfully  complete  the  historic  mission 
(Mindell,  2011). 

Example Two - On June 1, 2009, Air France (AF) Flight 447 was heading out of Rio de
Janeiro to cross the Atlantic Ocean and land in Paris. Unfortunately, multiple large storm
systems were in its path. Flying into the storms caused the three pitot tubes that provided
airspeed information to the automated flight control system and to the cockpit displays
to freeze up. The resulting lack of airspeed information caused the autopilot to disconnect,
which meant the human crew had to take over. This was a sudden, challenging, and dangerous
situation, but according to France’s Bureau d’Enquêtes et d’Analyses pour la sécurité de
l’aviation civile (2012), it should have been handled by the flight crew. However, in this
event, “The crew, progressively becoming de-structured, likely never understood that it was
faced with a ‘simple’ loss of three sources of airspeed information” (Bureau d’Enquêtes et
d’Analyses pour la sécurité de l’aviation civile, 2012, p. 199). During the confusion the Pilot
Flying’s  control  inputs  induced  a  stall  which  caused  the  big  airliner  to  lose  all  lift  and  to 
ultimately  plunge  from  a  flying  altitude  of  approximately  35,000  feet  into  the  ocean.  The  time 
from  the  autopilot  disconnecting  to  the  crash  was  approximately  4  minutes  and  23  seconds. 
The  accident  caused  228  fatalities,  including  the  crew. 

The  two  examples  differ  dramatically  in  terms  of  their  outcomes  and  the  systems  involved, 
but  they  are  both  examples  of  an  important  trend  in  aerospace  systems.  More  and  more,  the 
human  is  interacting  with  automated  systems  that  generally  do  a  good  job,  but  sometimes  will 
fail suddenly at unexpected moments during critical situations. In such rare moments it is up
to the human to step in and take control. At these points the human’s expertise must cope with
the time pressure imposed by having to act quickly to produce the correct input, or at least to
avoid an incorrect act (Sherry & Mauro, 2015). In these time-critical moments, the human
must  perform  an  immediate,  or  at  least  very  quick,  situation  assessment  that  leads  to  the 
correct  actions. 

But how well do human decision makers, even highly expert ones, deal with such time
pressure? This is not an easy question to address in typical laboratory research. True experts
in any domain, and perhaps especially in aviation, are often difficult to recruit, and realistic
tasks that would invoke their expertise often require sophisticated and expensive simulators.
To circumvent these difficulties, the current study examines the effect of time pressure on the
performance of world-class experts in chess. Specifically, it examines the performance of
chess grandmasters competing in real chess tournaments that imposed very different levels of
time pressure.

An  immediate  and  fair  question  regarding  the  present  experiment  might  be,  “But,  what  does 
chess  playing  have  to  do  with  automation  in  aviation?”  It  is  a  fair  question  because  the  two 
domains  are,  after  all,  very  different.  However,  in  terms  of  the  cognitive  processes  assumed  to 
be  involved  there  is  more  similarity  than  might  be  immediately  obvious.  For  example,  both 
domains  involve  lengthy  training  to  develop  perceptual  abilities  to  recognize  opportunities  or 
dangers.  Both  aviation  and  chess  also  require  the  cognitive  flexibility  to  update  or  revise  a 
plan  when  circumstances  change.  And,  both  aviation  and  chess  will  sometimes  force  the 
expert  to  act  when  there  is  very  little  time  available  to  think.  So,  chess  is  a  reasonable  domain 
for  studying  expert  performance  with  the  expectation  that  the  results  will  generalize  to  other 
domains,  including  aviation. 

Furthermore, chess has several advantages that make it an especially attractive domain for
examining time-pressure effects on highly expert performance: (1) an established rating
system for evaluating player expertise, (2) competitions conducted under varying time
constraints, and (3) sophisticated chess evaluation software for detailed move-by-move
performance analysis.

Chess Expertise Rating System. Competitive chess players are carefully rated by national
chess associations and the Fédération Internationale des Échecs (FIDE) via the Elo rating
system (Elo, 1978; Glickman, 1995; Howard, 2006). Under the Elo system a beginner is
expected to have a rating of approximately 600 to 800 Elo points. Anyone possessing over
2000 points would generally be considered an expert (Vaci, Gula, & Bilalic, 2014). At the
highest end of the chess expertise scale, grandmasters would have ratings above 2500 points.
Points are gained in sanctioned events by scoring better than the rating difference predicts
(for example, by defeating or drawing against a higher-rated player) and lost by scoring worse
(for example, by losing or drawing against a lower-rated player). Elo ratings have been
successfully used to gauge expertise in previous studies (e.g., Calderwood, Klein, & Crandall,
1988; Chabris & Hearst, 2003; Howard, 2006).
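The rating mechanics described above can be illustrated with the commonly published logistic form of the Elo expected-score and update formulas. This is a minimal sketch rather than the procedure of any particular federation: the function names are hypothetical and the K-factor of 10 is an assumed value chosen only for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (between 0 and 1) for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating_a: float, rating_b: float, score: float, k: float = 10.0) -> float:
    """Rating of player A after one game against player B.

    score is 1.0 for a win, 0.5 for a draw, and 0.0 for a loss.
    k is an assumed K-factor; rating bodies choose their own values.
    """
    return rating_a + k * (score - expected_score(rating_a, rating_b))


# Illustrative ratings only: a 2540-rated player draws a 2812-rated player.
lower, higher = 2540, 2812
print(round(expected_score(lower, higher), 3))        # expected score of the lower-rated player
print(round(updated_rating(lower, higher, 0.5), 1))   # the draw raises the lower rating
print(round(updated_rating(higher, lower, 0.5), 1))   # the same draw lowers the higher rating
```

Under these formulas a draw transfers points from the higher-rated to the lower-rated player, because each player's result is compared with what the rating difference predicted.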

Chess Time Constraints. The time allotted to the two players is one of the key variables
defining different competitive events. Each player has a clock that counts down whenever it
is that player’s move. If a player’s cumulative elapsed time exceeds the allotted time, that
player loses the game. For many years, the standard time allotment for each player in major
events was two and a half hours for the first 40 moves with an additional allotment of one
hour for every 16 moves thereafter. As mechanical chess clocks were supplanted by
computerised digital clocks, time allotments now often include an increment added
with each move made (typically 10 to 30 additional seconds per move). Although major
events will usually specify something akin to the standard time allotment with a per-move
increment, some events impose more time pressure on the competitors by
allowing only 15 minutes (typically referred to as Rapid events), or even only 5 minutes
(typically referred to as Blitz events), for the entire game.
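To give a sense of how different these time controls are, the allotments above can be converted into average thinking time per move. The sketch below is only illustrative arithmetic: the function name is hypothetical, and the 40-move game length used for the Rapid and Blitz cases is an assumption, since those allotments cover the entire game regardless of its length.

```python
def seconds_per_move(base_minutes: float, moves: int, increment_seconds: float = 0.0) -> float:
    """Average thinking time per move available to one player."""
    total_seconds = base_minutes * 60.0 + increment_seconds * moves
    return total_seconds / moves


standard = seconds_per_move(150, 40)  # two and a half hours for the first 40 moves
rapid = seconds_per_move(15, 40)      # assumes the game lasts 40 moves
blitz = seconds_per_move(5, 40)       # assumes the game lasts 40 moves

print(standard, rapid, blitz)  # 225.0 22.5 7.5
```

Under these assumptions a blitz player has roughly one thirtieth of the thinking time per move available under the traditional standard allotment.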

Chess Evaluation Software. Chess game-playing and analysis software uses algorithms to
evaluate the current position and the positions that would result from each legal move, and
thereby identifies the best possible move according to those algorithms. When analysing a
move made by a human player, the software subtracts the evaluation of the player’s actual
move from the evaluation of the best possible move; the difference is the calculated blunder
size for any move other than the best possible one. During the last decade or so there has been
an emerging consensus that such computer evaluation of chess moves enables in-depth and
unbiased evaluation of chess positions and moves (Chabris & Hearst, 2003). As Garry
Kasparov, the former Chess World Champion (1985-1993), described computerized chess,
“Your pocket calculator has no trouble calculating 89 x 97, and chess programs like Fritz and
Junior are just as quick to produce the solutions to complicated tactical positions. They trawl
through all the possibilities looking for the path that leaves them with the most material. It’s a
brute force system that isn’t particularly elegant, but in complex positions it’s undeniably
effective.” (Kasparov, 2007, p. 58). Furthermore, computer-aided training has been credited
with improving the chess play of modern champions versus earlier champions (Breutigan,
Yusupov, & Lutz, 2004). It
has  also  become  common  to  evaluate  performance  in  chess  championship  play  by  comparing 
the  chess  moves  made  to  computer  recommendations  during  the  match  (e.g.,  Topalov  & 
Ginchev,  2007).  As  K.  Anders  Ericsson  (2014,  p.  459)  put  it,  “Today’s  chess  programs  are  so 
much  better  than  human  players  that  their  selected  move  can  be  used  as  a  gold  standard  for 
the  best  move.” 

The effects of time pressure have been examined in chess performance before, but the results
have been mixed. For example, Calderwood et al. (1988) compared the rated move quality
for three Class B (average Elo rating = 1750) and three Master level (average Elo rating =
2435) chess players playing under either standard time constraints or blitz chess (5 minutes
allowed for each player). The quality of each move in the games (other than the
opening moves) was then rated on a 5-point scale by a grandmaster blinded to the expertise
level of the player making the move and the type of event that the game came from. Although
the move quality ratings were significantly higher for the moves made by the Master level
players, no difference was detected between the standard and blitz time controls. However,
Chabris and Hearst (2003) used computerized chess analysis to compare grandmasters playing
either standard time control games or rapid time controls (approximately 25-30 minutes
allowed for each player). Their results showed that rapid chess led to more and larger
blunders. In short, Calderwood et al. (1988) found no time pressure effect among lower-rated
players under severe time pressure, whereas Chabris and Hearst (2003) found an effect
among grandmasters under milder time pressure.

The  different  findings  of  the  Calderwood  et  al.  and  the  Chabris  and  Hearst  experiments  could 
potentially  have  resulted  from  differences  in  their  experimental  designs  and  procedures.  For 
example,  Calderwood  et  al.  had  only  three  players  in  each  of  their  two  groups  and  each 
player’s  data  came  from  a  total  of  eight  games.  This  suggests  that  statistical  power  might  have 
been  quite  limited,  despite  the  fact  that  the  results  found  a  statistically  significant  difference  in 
move  quality  due  to  skill  level  (i.e.,  Master  players  versus  Class-B  players).  In  contrast, 
Chabris  and  Hearst  studied  the  performance  of  23  grandmasters  in  hundreds  of  games. 
Another possibility for the disparity in outcomes could be that the computerized brute force
assessment used by Chabris and Hearst might have produced a more thorough tactical
evaluation than the subjective expert ratings used in the Calderwood et al. research.
Calderwood et al. (1988, pp. 486-487) themselves suggested that limited sensitivity might
have been a contributor to the failure to detect a difference between standard and blitz play:
“The possibility exists that our rating scale was not sensitive enough to reflect decrements
in  move  quality  under  blitz  conditions,  although  the  fact  that  the  scale  did  detect  differences 
in  player  strength  somewhat  belies  this  explanation.” 

Whatever  the  explanation  for  the  differences  in  the  results,  the  effect  of  time  pressure  on 
expert  chess  performance  was  not  settled  by  these  two  studies.  To  help  resolve  the 
disagreement  between  the  previous  research  findings,  the  present  work  used  a  computerized 
chess  move  analysis  procedure  to  examine  top-level  grandmaster  chess  performance  in 
standard  time  control  chess  versus  blitz  chess. 

The  current  experiment  was  explicitly  designed  to  address  two  questions: 

1.  Would  players  at  the  highest  levels  of  expertise  be  susceptible  to  time  pressure 
effects? 

2.  Would  the  degree  of  time  pressure  effect  be  associated  with  the  level  of  expertise  of 
the  individual  grandmasters? 

If  such  highly-rated  chess  players  were  more  likely  to  blunder  under  time  pressure,  it  would 
suggest  that  even  a  skilled  human  pilot  suddenly  confronted  with  time  pressure  due  to  an 
automation  failure  could  also  be  vulnerable  to  performance  degradation. 

Method 

Data  Source 

One of the most prestigious annual chess tournaments is held in the Dutch city of Wijk aan
Zee. It is a standard time control tournament that attracts many of the top-rated players in the
world. In 1999, the standard time control tournament was supplemented by a blitz
tournament. Although not all competitors from the standard time control tournament
competed in the blitz tournament, 13 players did. The experimental data set was created by
downloading games from the tournament website. All available games for both events were
analysed with the Fritz 9 chess software (Morsch, 2005). The Fritz software identified
blunders by calculating the positional evaluations resulting from all possible moves that could
be made; if the move that the player selected was not the best, then the difference between that
move’s positional evaluation and the best possible positional evaluation was the blunder
size. Any pairing of participants that had missing data, or that failed to have at least one
Fritz-evaluated move for both players in the same event, was deleted from the present analysis.
Thus, for any given player, performance in the two events was based on games against
identical sets of opponents. Ultimately, 58 game pairs (with each game providing data for two
players) were identified and provided the data for the players’ performance in the two events.

Participants 

Thirteen grandmaster chess players who participated in the 1999 Wijk aan Zee chess
tournament provided the data for this study (see Table 1). Their mean Elo rating was 2670
(95% Confidence Interval (CI), [2620, 2719]). The range of ratings was from 2540 to 2812.
Six  of  the  grandmasters  were  ranked  in  the  top  ten  in  the  world  according  to  the  most  recent 
FIDE  rankings  (British  Chess  Magazine,  January,  1999).  The  then-current  World  Chess 
Champion  (i.e.,  Garry  Kasparov)  was  among  the  top  players  that  competed  in  the  tournament. 

Table 1. Player Elo ratings

Player #   Player Name        Elo Rating
01         A. Yermolinsky     2597
02         D. Reinderman      2540
03         G. Kasparov*       2812
04         I. Sokolov         2610
05         J. Piket           2609
06         J. Timman          2649
07         L. van Wely        2636
08         P. Svidler*        2713
09         R. Kasimdzhanov    2595
10         V. Anand*          2780
11         V. Ivanchuk*       2714
12         V. Kramnik*        2751
13         V. Topalov*        2700

* = Ranked in the Top 10 FIDE Ratings (British Chess Magazine, 1999)


Initial  Data  Processing 

All games from the standard tournament and the blitz tournament were analysed by the Fritz
chess software. The software analysed any move that was neither part of an identifiable
opening sequence nor made in a position with a possible forced mating sequence. For each
evaluated move the value of the resulting position was calculated; a zero positional
evaluation indicated an even game, positive values indicated an advantage for the white
player, and negative values indicated an advantage for the black player. To identify any
blunder, the Fritz software also calculated the difference between the positional value of the
move played and the value of the best possible move according to its calculations.
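The blunder calculation described above can be sketched in a few lines. This is an illustration of the arithmetic only, not the Fritz software's interface: the function name and the example evaluations are hypothetical. Following the note under Table 2, the difference is taken as an absolute value so that blunders by White and by Black are on the same scale.

```python
def blunder_size(best_eval: float, played_eval: float) -> float:
    """Evaluation given up by the played move, as a non-negative number.

    Both evaluations use the convention described in the text:
    zero is an even game, positive favours White, negative favours Black.
    """
    return abs(best_eval - played_eval)


# Hypothetical White move: the best move kept +0.75, the move played kept +0.25.
print(blunder_size(0.75, 0.25))    # 0.5

# Hypothetical Black move: the best move reached -0.75, the move played reached -0.25.
print(blunder_size(-0.75, -0.25))  # 0.5

# Playing the best move gives a blunder size of zero.
print(blunder_size(0.25, 0.25))    # 0.0
```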

Calculation  of  Each  Player’s  Expected  Blunder  per  Move  (EBPM)  scores 

Following  the  Initial  Data  Processing  step,  the  blunder  evaluations  for  each  player  were 
summed  for  each  event  and  divided  by  the  total  number  of  evaluated  moves  to  create  the 
Expected  Blunder  per  Move  (EBPM)  score.  Table  2  shows  the  number  of  analysed  moves 
each  player  made  in  each  event  and  their  mean  EBPM  score  for  each  event. 
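As a minimal sketch of this step, the EBPM score pools a player's blunder sizes across all of the games in one event and divides by the number of evaluated moves. The function name and the example values are hypothetical; the pooling across games follows the description above.

```python
def expected_blunder_per_move(games: list[list[float]]) -> float:
    """EBPM for one player in one event.

    games holds one list of blunder sizes per game, one entry per evaluated move.
    All moves are pooled, so longer games carry more weight than shorter ones.
    """
    blunders = [abs(b) for game in games for b in game]
    if not blunders:
        raise ValueError("no evaluated moves")
    return sum(blunders) / len(blunders)


# Hypothetical player with two games: five evaluated moves, then three.
example_games = [[0.0, 0.0, 0.5, 0.0, 1.5], [0.0, 0.25, 0.75]]
print(expected_blunder_per_move(example_games))  # 0.375
```

Pooling moves in this way is a design choice that differs from averaging per-game means: in the example the per-game means are 0.4 and about 0.33, while the pooled score is 0.375.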

Table 2. Player Data by Time Control Event Type

                              Standard Games         Blitz Games
Player #   Player Name        # of Moves   EBPM      # of Moves   EBPM
01         A. Yermolinsky     243          0.0878    274          0.3364
02         D. Reinderman      288          0.3466    292          0.4786
03         G. Kasparov        180          0.1076    292          0.2704
04         I. Sokolov         164          0.0821    327          0.4544
05         J. Piket           154          0.1794    359          0.5358
06         J. Timman          257          0.1721    264          0.4066
07         L. van Wely        156          0.1564    257          0.3936
08         P. Svidler         125          0.3718    282          0.4570
09         R. Kasimdzhanov    194          0.2648    203          0.7502
10         V. Anand           106          0.1745    186          0.2585
11         V. Ivanchuk        182          0.1186    240          0.2158
12         V. Kramnik         95           0.0807    180          0.2797
13         V. Topalov         255          0.1945    339          0.4454

Notes. All blunders were converted to absolute values to allow the combination of blunders
whether playing the White or Black pieces.


Results 

Event  Type  Effect  on  EBPM 

To test whether the increased time pressure exerted by the Blitz event actually degraded the
grandmasters’ performance, a paired t-test was calculated to compare the players’ performance
in standard time control games to their Blitz game performance. The t-test found a significant
effect of Tournament Event Type on the EBPM measure (t(12) = 6.723, p < 0.0001, d = 1.865).
As listed in Table 3 and illustrated in Figure 1, the average EBPM in the Blitz event was
substantially higher than in the Standard tournament event. As a rule of thumb, Cohen (1988)
suggested that any effect size larger than 0.8 could be considered a large effect.
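The reported test statistic can be checked against the per-player EBPM scores in Table 2. The sketch below assumes that the comparison was a paired t-test on the thirteen players' scores and that d is the mean within-player difference divided by the standard deviation of those differences. Neither detail is stated explicitly in the text, so both are assumptions, as is the function name.

```python
import math
import statistics

# Per-player EBPM scores from Table 2, in player order 01 to 13.
standard = [0.0878, 0.3466, 0.1076, 0.0821, 0.1794, 0.1721, 0.1564,
            0.3718, 0.2648, 0.1745, 0.1186, 0.0807, 0.1945]
blitz = [0.3364, 0.4786, 0.2704, 0.4544, 0.5358, 0.4066, 0.3936,
         0.4570, 0.7502, 0.2585, 0.2158, 0.2797, 0.4454]


def paired_t_and_d(first, second):
    """Return (t, degrees of freedom, d) for paired samples."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))   # paired t statistic
    d = mean_diff / sd_diff                    # effect size on the differences (assumed definition)
    return t, n - 1, d


t_stat, df, cohen_d = paired_t_and_d(standard, blitz)
print(f"t({df}) = {t_stat:.2f}, d = {cohen_d:.2f}")
```

With thirteen players the test has 12 degrees of freedom, and under these assumptions t equals d multiplied by the square root of 13, which matches the relationship between the two reported values.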

Table 3. Mean EBPM, Standard Error, and 95% Confidence Intervals by Event Type

Event Type   Mean EBPM   Std. Error   95% CI
Standard     0.180       0.026        [0.122, 0.238]
Blitz        0.406       0.040        [0.320, 0.492]
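The entries in Table 3 can be approximately reproduced from the per-player scores in Table 2. The sketch below assumes that each interval is the mean plus or minus a Student's t critical value times the standard error, with 12 degrees of freedom. The critical value of 2.179 is taken from a standard t table rather than from the source, and the function name is hypothetical.

```python
import math
import statistics

# Per-player EBPM scores from Table 2, in player order 01 to 13.
standard = [0.0878, 0.3466, 0.1076, 0.0821, 0.1794, 0.1721, 0.1564,
            0.3718, 0.2648, 0.1745, 0.1186, 0.0807, 0.1945]
blitz = [0.3364, 0.4786, 0.2704, 0.4544, 0.5358, 0.4066, 0.3936,
         0.4570, 0.7502, 0.2585, 0.2158, 0.2797, 0.4454]

T_CRIT_DF12 = 2.179  # assumed: two-sided 95% critical value of t for 12 degrees of freedom


def mean_se_ci(values, t_crit):
    """Return (mean, standard error, (lower, upper)) for a sample."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean, se, (mean - t_crit * se, mean + t_crit * se)


for label, scores in (("Standard", standard), ("Blitz", blitz)):
    mean, se, (low, high) = mean_se_ci(scores, T_CRIT_DF12)
    print(f"{label}: mean = {mean:.3f}, SE = {se:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

Small differences in the last decimal place relative to Table 3 are possible, depending on where rounding was applied in the original calculation.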

Elo-EBPM Correlation Analysis

Although the above analysis shows that the performance of the group, as a whole, was
degraded by time pressure, it does not explore whether the higher-rated players within this
highly selected group were more resistant to time pressure than the lower-rated players. To
test whether the different levels of expertise across the players influenced their performance
in the two events, correlations between the players’ Elo ratings and their mean EBPM
performance in each of the tournament events were calculated.

Interestingly, the correlation was not significant for the Standard tournament event analysis
(r(11) = -0.292), but the correlation was significant in the Blitz tournament analysis
(r(11) = -0.651, p < 0.05, r2 = 0.423).
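Both correlations can be checked against the ratings in Table 1 and the EBPM scores in Table 2. The sketch below assumes Pearson correlations across the thirteen players, tested with 11 degrees of freedom. The critical value of 2.201 comes from a standard t table rather than from the source, and the function names are hypothetical.

```python
import math

# Elo ratings from Table 1 and EBPM scores from Table 2, in player order 01 to 13.
elo = [2597, 2540, 2812, 2610, 2609, 2649, 2636, 2713, 2595, 2780, 2714, 2751, 2700]
standard_ebpm = [0.0878, 0.3466, 0.1076, 0.0821, 0.1794, 0.1721, 0.1564,
                 0.3718, 0.2648, 0.1745, 0.1186, 0.0807, 0.1945]
blitz_ebpm = [0.3364, 0.4786, 0.2704, 0.4544, 0.5358, 0.4066, 0.3936,
              0.4570, 0.7502, 0.2585, 0.2158, 0.2797, 0.4454]

T_CRIT_DF11 = 2.201  # assumed: two-sided 5% critical value of t for 11 degrees of freedom


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


for label, ebpm in (("Standard", standard_ebpm), ("Blitz", blitz_ebpm)):
    r = pearson_r(elo, ebpm)
    significant = abs(t_from_r(r, len(elo))) > T_CRIT_DF11
    print(f"{label}: r = {r:.2f}, r^2 = {r * r:.2f}, significant at .05: {significant}")
```

A negative correlation here means that higher-rated players tended to have smaller expected blunders per move.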

The fact that the standard time Elo-EBPM correlation was not significant is potentially
surprising, because the Elo ratings would have been based on the players’ previous
performances in standard time events. However, as Vaci et al. (2014) pointed out, studies of
highly selected groups of experts can suffer from range restriction effects. So, a weak
correlation in a group selected from only among the very highest-rated chess players is
reasonable.

But, given that logic, it is perhaps even more interesting that the Blitz Elo-EBPM correlation
was significant. This suggests that within this group of highly expert players, the relatively
small differences in expertise were associated with differences in susceptibility to
time-pressure-induced performance degradation.

Such  a  connection  between  chess  expertise  and  relative  resistance  to  time  pressure  effects  is 
inconsistent  with  van  Harreveld,  Wagenmakers,  and  van  der  Maas’s  (2007)  finding  that 
strong  chess  players’  ratings  will  become  less  predictive  of  their  performance  as  they  play  in 
events  that  force  them  to  play  faster.  More  data  will  be  required  to  resolve  the  true 
relationship  between  a  chess  player’s  rating  and  their  relative  resistance  to  time  pressure 
effects. 

Discussion 

In  both  aviation  and  competitive  chess,  it  is  justifiably  assumed  that  developing  a  level  of 
expertise  through  training  and  practice  is  important.  As  with  pilots  flying  military  missions  or 
commercial  aircraft,  the  chess  players  in  this  study  can  be  considered  experts.  Indeed,  the 
chess  players  in  this  study  must  be  considered  at  the  most  extreme  high  end  of  chess  playing 
expertise.  The  analogous  level  of  expertise  in  piloting  would  be  difficult  to  define,  but  it 
would  certainly  be  greater  than  simply  qualifying  to  be  a  military  or  commercial  aviation 
pilot. 

One aspect of expert performance in both domains would be well-developed pattern
recognition (e.g., Burns, 2004; Gobet & Simon, 1996; Kaber & Endsley, 1997), that is, the
ability to quickly “size up” a situation in order to respond appropriately. It is expected that an
expert will very quickly, even immediately, know what to do in situations where a less expert
person would require thinking time to assess different aspects of the situation and carefully
think through the implications of their combination to select an appropriate action. The
distinction between such “fast” versus “slow” processes is common in modern cognitive
psychology (e.g., Evans, 2003; Gobet & Simon, 1996; Kahneman, 2011; Moxley, Ericsson,
Charness, & Krampe, 2012; Schneider & Shiffrin, 1977; van Harreveld et al., 2007; Wan,
Nakatani, Ueno, Asamizuya, Cheng, & Tanaka, 2011).

Because the time available to think is seriously curtailed or even eliminated, it is the “fast”
cognitive processes of experts that are challenged by playing blitz chess or dealing with
sudden situations after an automation failure. As noted above, earlier findings on the
effect of playing speed on the chess performance of expert players have been mixed (e.g.,
Calderwood et al., 1988; Chabris & Hearst, 2003). In the present study, however, the
limitations of such capabilities were clearly demonstrated. Under the extreme demands of blitz
chess, even a group selected from the most expert players in the world showed a
statistically significant increase in their tendency to select weaker moves, and the effect was
large, with the size of the expected blunder more than doubling.

This result would probably not surprise the chess players who participated in the events
analysed. For example, one of the competitors, the former world chess champion Garry
Kasparov, noted that time pressure posed a big challenge even to his own chess playing: “The
worst  enemy  of  the  strategist  is  the  clock.  Time  trouble,  as  we  call  it  in  chess,  reduces  us  all  to 
pure  reflex  and  reaction,  tactical  play.  Emotion  and  instinct  cloud  our  strategic  vision  when 
there  is  no  time  for  proper  evaluation.  Even  the  most  honed  intuition  can’t  entirely  do  without 
accurate  calculations.  A  game  of  chess  can  suddenly  seem  like  a  game  of  chance.”  (Kasparov, 
2007,  pp.  45-46) 

What does this mean in the aviation domain? It strongly suggests that we cannot expect to just
train  pilots  to  a  degree  of  expertise  where  they  can  be  confidently  expected  to  jump  into  an 
unexpected  automation  failure  and  intuitively  react  with  an  optimal  response  while  under  time 
pressure.  The  present  results  suggest  that  either  the  automation  must  be  so  complete  and 
reliable  that  we  can  remove  the  pilot  entirely,  or  we  should  design  systems  that  keep  the  pilots 
in-the-loop  enough  that  they  have  a  better  understanding  of  the  on-going  situation  and  a  better 
chance  to  bring  their  expertise  to  bear  in  situations  where  the  automation  cannot  cope.  If  an 
automation failure occurs while the pilot is still somewhat engaged in the task, the
requirement to perform a complete and immediate analysis of the situation from scratch is
diminished, which should likewise reduce the experience of time pressure.

Certainly  better  displays  and  improved  communication  between  the  automation  and  the  pilots 
should  also  be  helpful,  but  it  is  doubtful  that  the  system  designer  or  the  real-time  automation 
can  be  expected  to  predict  the  unexpected  situation  well  enough  to  provide  the  best  display  to 
support  a  previously  disengaged  human’s  quick  and  accurate  appreciation  of  what  should  be 
done.  Indeed,  if  the  automation  had  such  clear  information  to  give  to  the  human,  it  would 
probably  be  able  to  handle  the  situation  itself  without  human  intervention. 

Perhaps  no  perfect  solution  exists  for  the  current  automation-failure  challenge,  but  it  might  be 
worth  investigating  whether  aviation  automation  designs  that  allow,  or  even  encourage, 
complete  task  disengagement  by  the  human  operator  should  be  avoided  (Casner,  Geven, 
Recker,  &  Schooler,  2014;  Ebbatson,  Harris,  Huddlestone,  &  Sears,  2010;  Kaber  &  Endsley, 
1997;  Welford,  1958).  Perhaps  most  tellingly,  Onnasch,  Wickens,  Li,  and  Manzey  (2014) 
conducted  an  extensive  meta-analysis  of  the  impact  of  different  levels  of  automation  on 
human  performance  and  one  of  the  key  findings  was  degradation  of  the  human’s  situation 
awareness  and  system  failure  performance  as  the  degree  of  automation  increased. 

Therefore, if the human pilot or operator is going to be the final line of defense to ensure
system effectiveness or safety, then semi-automated systems should be better designed to
maintain on-going human interaction so that any automation disengagement would be less
likely to produce sudden time pressure on the human to figure out what is happening and then
what to do.

That  is  too  much  to  ask,  even  from  experts. 